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ABSTRACT 

We propose a novel approach to background cor- 
rection for Infinium HumanMethylation data to 
account for technical variation in background fluor- 
escence signal. Our approach capitalizes on a new 
use for the Infinium I design bead types to measure 
non-specific fluorescence in the colour channel 
opposite of their design (Cy3/Cy5). This provides 
tens of thousands of features for measuring back- 
ground instead of the much smaller number of 
negative control probes on the platforms (n = 32 for 
HumanMethylation27 and n = 614 for Human- 
Methylation450, respectively). We compare the per- 
formance of our methods with existing approaches, 
using technical replicates of both mixture samples 
and biological samples, and demonstrate that 
within- and between-platform artefacts can be sub- 
stantially reduced, with concomitant improvement 
in sensitivity, by the proposed methods. 



INTRODUCTION 

Epigenetic marks play a central role in gene regulation (1). 
Perhaps the two most widely studied epigenetic marks are 
histone tail modifications and DNA methylation (in par- 
ticular, 5-methylcytosine). Bisulfite conversion of unmethyl- 
ated cytosines to uracils, followed by amplification, 
allows existing genotyping and sequencing technologies 
to be repurposed for studies of DNA methylation. Small 
(sub-microgram) amounts of DNA from hundreds of 
samples can be assayed at thousands of sites throughout 
the genome using high-density microarrays (2), for a 
fraction of the cost of whole-genome bisulfite reseq- 
uencing. As with all microarray technologies, however, 



signal processing is an important consideration in obtain- 
ing consistent, reproducible results (3). 

The lllumina BeadArray platform, in the form of the 
HumanMethylation27 (HM27) and HumanMethyla- 
tion450 (HM450) assays, is widely used to study genomic 
DNA methylation. At each targeted cytosine position, the 
fluorescence intensities of the methylated and unmethylated 
signals are measured. DNA methylation level is estimated 
by the ratio of DNA methylation intensity over the total, 
M/(M+U), where M and U denote the average fluorescent 
signals from the methylated and unmethylated bead types, 
respectively. lllumina adds a stabilizing constant of 100 in 
the denominator, and calls the ratio the average Beta value. 
A benefit of Illumina's HM450 BeadArray design is that for 
each probe sequence, a median of 14 beads are randomly 
distributed on the array, with each bead containing 
hundreds of thousands of oligonucleotides. The HM27 
measures even more beads per probe. Further, lllumina 
organizes arrays into 12-sample BeadChips offering great 
potential for automation. However, as with the analysis of 
gene expression BeadChips, the sequential processing of 
samples can give rise to specific types of technical artifacts. 
Potential issues include the variation in background fluor- 
escence (4), the positional effects on a single BeadChip (5) 
and efficiency of bisulfite conversion (6). We focus on the 
issue of array-to-array variation in non-specific (back- 
ground) fluorescence. 

In microarray studies, background fluorescence appears 
to contribute an independent additive error to the measure 
of signal. The observed intensity (also known as fore- 
ground, X f ) is modelled as the sum of true signal (X s ) 
and background (X b ), X f = X s + X b . An additive bias on 
the intensity translates to a reduced dynamic range for the 
Beta value. The goal of background correction methods 
is to estimate the true signal from the observed fore- 
ground. The primary methods currently available for 
HumanMethylation arrays use a simple background 
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subtraction, a method that suffers from truncation of the 
data at low intensity signals (e.g. GenomeStudio 2011.1 
and 'lumi' package in Bioconductor 2.10). The de facto 
standard for expression microarray background correc- 
tion is the normal-exponential convolution, with the de- 
sirable feature of smooth interpolation, rather than 
truncation, of low intensity signals (7). We evaluate the 
use of convolution models for background correction of 
DNA methylation arrays, considering three models. First, 
we adapt the popular normal-exponential convolution 
model to the two-colour BeadArray design; second, we 
introduce a novel Gamma convolution model, a general- 
ization of both the normal-exponential and Gamma- 
exponential (8); third, we adapt a distribution-free 
approach previously proposed for Affymetrix 
GeneChips. For each of the convolution models, we cap- 
italize on the unique design of the Infinium I chemistry 
DNA methylation probes, to vastly increase the number 
of features on the array from which we measure back- 
ground fluorescence. We find that convolution models 
that use the Infinium I chemistry probes to measure back- 
ground improve the consistency of measures at identical 
target sequences from multiple arrays of the same biolo- 
gical samples, and show less bias than other methods in 
estimates of percent DNA methylation, as determined 
from calibrated mixture samples. 

MATERIALS AND METHODS 

Illumina HumanMethylation27 and 
HumanMethylation450 BeadArrays 

The Illumina HM27 array interrogates 27 578 targeted 
CpGs. Each array includes 144 control bead types, 32 of 
which are 'negative' controls designed to not match any 
genomic regions and thus measure background fluores- 
cence. The HM450 array interrogates 485 577 targets, of 
which 482 421 are CpGs (>99.3%), 3091 are CpHs (0.6%) 
and 65 are SNPs (<0.1%). A total of 850 control bead 
types are assayed, 614 of which are 'negative' controls. Six 
hundred of the negative controls are used by 
GenomeStudio for background correction. 

The bead types on the array use two probe chemistries 
(Infinium I and Infinium II) and two colour dyes 
(Cy3-Green/Cy5-Red) (2). Infinium I probes are paired 
50-base oligonucleotides, with one terminating directly 
across from a methylated 'C and the second from an 
unmethylated 'C. The extension base fluorophore 
dictates the colour channel (Cy3 for 'C and Cy5 for 'A' 
or 'T'), and is the same for both the methylated and 
unmethylated probes within a pair, so that both probes 
fluoresce at the same wavelength. Infinium II probes have 
a single 49-base oligonucleotide that hybridizes to the 
bisulfite-converted target sequence in a non-DNA methy- 
lation-dependent fashion upstream of the target CpG. The 
DNA methylation of the interrogated CpG is measured by 
extension of either a Cy3-linked base for methylated sites 
or a Cy5-linked base for unmethylated sites. These probes 
are designed to fluoresce at either wavelength (Cy3 for 
methylated, Cy5 for unmethylated). All probes on the 
HM27 array use the Infinium I chemistry; the HM450 



array uses a combination of both (30% Infinium I and 
70% Infinium II). 

As probe pairs using the Infinium I chemistry fluoresce 
at the same wavelength, all fluorescence from probes of 
this design at the wavelength of the opposite 
fluorophore — that which is not the extension base — can 
be used to estimate non-specific fluorescence across the 
array. All 27 578 probes on the HM27 array and 135 501 
(30%) of the probes on the HM450 array are of this type, 
and provide fluorescence signals in the opposite channel 
from probe design ('out-of-band'). On the HM27 array, 
15 500 Cy3 and 12 078 Cy5 probe pairs yield 31000 and 
24 1 56 features in each channel, respectively, numbers in 
vast excess of the 32 'negative' controls on the platform. 
The corresponding numbers for the HM450 array are 
92 596 Cy3 and 178 406 Cy5 features for estimating 
non-specific fluorescence. We use these sets of features 
when correcting for background using the convolution 
models described below. 

Data extraction 

The Illumina GenomeStudio software does not provide 
signal intensities for Infinium I probes in the opposite 
channel from the designed fluorophore. Therefore, we 
enhanced routines originally implemented by Dr Keith 
Baggerly, and later incorporated into the 'crlmm' 
package in Bioconductor (9) to recover this 'out-of- 
band' information from the binary .IDAT files produced 
by Illumina scanners. These procedures, and all processing 
methods described below, are available through 
Bioconductor in the package 'methylumi'. 

Background correction 

For all background-correction methods, probes are 
corrected within each colour channel, after pooling all 
types of specific probes assayed in that channel. The 
green channel probes include all Cy3 channel Infinium I 
probe pairs and the methylated cytosine Infinium II 
probes on the HM450 arrays. The red channel probes 
include the Cy5 channel Infinium I probe pairs and the 
unmethylated cytosine Infinium II probes. For clarity of 
presentation, the colour channel information is not noted 
for the methods detailed below. 

Table 1 provides a summary of all the background- 
correction approaches. Three convolution models are 
considered. 

i. Normal-exponential convolution. This method is 
described in detail by Ritchie et al. (7). Here we review 
the model, facilitating below the introduction of the more 
flexible Gamma convolution. Let X b ~N(u,a 2 ) and 
X s ~Exp(y), and the observed foreground intensity 
X f = X s + X b . We estimate parameters from the 
background distribution using the control probes, and 
the signal parameter y from the observed foreground 
intensities with the background mean subtracted 
(Xf — fi). The conditional expectation of the signal given 
the observed foreground and background is computed by, 
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Table 1. Methods to correct for background fluorescence on HumanMethylation BeadArrays 



Name 


Method 


Controls 


Distributions 


noob 


normexp (performs Normal-Exp convolution) 


Out-of-band intensities 


Y, ~ A/Y (/ rr^.\ 
A/, ~ N{(1,0 j 

X s ~Exp(y) 


goob 


Gamma convolution 


Out-of-band intensities 


x h ~ r(s,p) 

X s ~T(y, a ) 


doob 


Distribution-free convolution 


Out-of-band intensities 


None 


normexp 


normexp 


Negative controls 


X,, ~N(fi,a 2 ) 
X s ~ Exp(y) 


dfcm 


Distribution-free convolution 


Negative controls 


None 


q5 


Subtract fifth percentile 


Negative controls 


None 


lumi 


lumi (performs background subtraction) 


Methylated probe allele intensities 


None 



Offset of 15 is added to x s or E[X s |X f ]. 



where fj, s ,f=Xf—fi — a 2 /y, <p(-) the standard normal 
density and O the cumulative normal distribution. Details 
of the estimation procedure are given in (7). Using the 
conditional expectation smoothly interpolates probes with 
intensities near the background level. We evaluate the 
approach using two separate populations of control 
probes: (a) negative control probes ('normexp') and (b) 
'out-of-band' Infmium I probes ('noob' = Normal- 
exponential using out-of-band probes). 

ii. Gamma convolution. We developed and implemented 
a Gamma-Gamma convolution, with background 
intensity distribution Gamma(§,(3) and signal abundance 
distribution Gamma(y,a). The model was chosen to 
restrict the range for the background fluorescence 
intensities to positive values, and to include the 
exponential signal distribution as a special case. 
Assuming mild restrictions on the shape and scale 
differences between two Gamma-distributed random 
variables, their sum can be modelled as approximately 
Gamma-distributed as well (http://ideas.repec.Org/p/dgr/ 
kubcen/200770.html). The signals for background are 
estimated using the out-of-band intensities, and the 
signal abundance parameters from the background 
subtracted foreground measures after forcing observations 
less than one to one [ = max(Xf-§p 2 ,l)]. The conditional 
expectation of the signal given foreground is given by 



E[X s \X f ] = [ 
Jo 



t>- 



t dt, 



where B(y,§) is a Beta distribution and iFi(a,b,x) is the 
confluent hypergeometric function. 

A fast maximum likelihood estimator (http: //research. 
microsoft.com/en-us/um/people/minka/papers/minka- 
gamma.pdf) is used to estimate the parameters from the 
data, and numerical integration is then performed to 
obtain the conditional expectation of the signal given the 
foreground and background intensities. To avoid 
instability in the numerical integration step, we use a 
simple heuristic: if an observation is greater than three 
standard deviations above the mean of the background, 
we simply subtract the background mean. In simulations, 
differences between the integral computed to arbitrary 



precision via the 'mpmath' package, and the 
approximation we use, were so small as to vanish at 
64-bit machine precision. We call this method 'goob' for 
Gamma out-of-band background correction. 

Hi. Distribution-free convolution model. We implement a 
variation of the distribution-free background correction 
approach in (10), adapted to use control bead types 
from the Illumina platform. Let 




if Xf > /2+2<r 



where /2 is the mean of the control bead types, a is ~Jl 
times the sample standard deviation of the intensities 
smaller than /2, a robust measure of standard deviation 
for the skewed noise distribution and min is the minimum 
intensity Xf. For small xj, the correction is a linear 
interpolation with slope defined by the minimum intensity 
value, and the background mean and robust estimate of 
standard deviation. For large Xf, the correction is 
background subtraction. We evaluate this method using 
two different populations of control probes: (a) negative 
control probes ('dfcm' = distribution-free convolution 
model) and (b) 'out-of-band' Infinium I probes 
('doob' = distribution-free using out-of-band probes). 

The convolution models above are compared with the 
two background subtraction methods currently available 
for this platform. The first is the method documented in 
the most recent release of Illumina's GenomeStudio 
software (ver2011.1), subtracting the fifth percentile of 
the negative control probe intensities on the array. The 
second is the method implemented by the 'lumi' package 
in Bioconductor version 2.10, subtracting the mode of the 
lower half of the distribution of methylated probe 
intensities. For both of these approaches, the maximum 
of the background-subtracted value and one is returned. 

All methods (see Table 1 for a summary) used an offset 
of 15, which was selected based on earlier results for gene 
expression arrays (11), and evaluation of the probe- 
specific variance for Beta values from technical replicates. 
Also, for all HM450 data, we apply an additional global 
dye-bias equalization step to control for the different 
average intensities in the red and green channels. This 
procedure scales the background-corrected intensities, 
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dividing by the average intensity of the positive control 
probes in the same channel, red or green, and multiplying 
by the average intensity of all positive controls in a 
reference array. The reference array is selected as the 
one with the smallest difference in average red and 
average green positive control intensity. This is similar 
to the approach implemented in GenomeStudio 
(ver2011.1), but can select a different array as the 
reference. 

DNA methylation summary measures 

We evaluate the effects of different background-correction 
methods on the bias and variance of DNA methylation 
estimates. The background-corrected and dye-bias 
equalized values x s are separated into the intensity 
estimates, m s and u s , for the methylated and unmethylated 
cytosines, respectively, and two summaries of DNA 
methylation are evaluated: (i) Beta value = m s /(m s + u s ), 
a value between 0 and 1, and (ii) M -value = log 2 (m s /u s ), 
the logit transformed Beta value ( = log 2 (Beta/(l— Beta)) 
with unrestricted boundaries (12). 

Illumina HumanMethylation data sets 

Four data sets were used to compare the benefits of 
differing background-correction methods for DNA 
methylation arrays. We used technical replicates to 
evaluate array-to-array variation for each probe, and 
calibrated mixture samples to evaluate bias and root 
mean squared error in estimating absolute percent DNA 
methylation, or the corresponding M-value. Finally, we 
evaluated the methods using two real data sets that 
contain replicate sample measurements. The real data 
studies allow us to evaluate the technical variation for 
large numbers of independent samples for two different 
cell types. 

Technical replicates 

Six technical replicates (1 ug genomic DNA each, 
extracted from commercially available pooled human 
male peripheral blood lymphocytes, or PBLs) were 
assayed as part of a larger experiment (13). Each 96-well 
plate of the study included the random placement of one 
technical replicate of the control PBL. Bisulfite conversion 
and whole genome amplification of the extracted DNA 
were performed by plate as described by Sandoval et al. 
(2); and samples were run on HM450 arrays. 

Calibrated mixture samples 

Four proportions of M.S.wI-treated sperm DNA (10, 35, 
60, 85%) were prepared by titrating equal concentrations 
of treated and mock-treated sperm DNA to achieve near 
uniformity of DNA concentration at the desired 
proportions. One microgram of DNA from each of the 
four mixture proportions was then randomly arranged 
on each of four 96-well plates and analysed on the 
HM450 arrays, as part of the same study as the technical 
replicates described above. As the M.SssI enzyme fully 
methylates DNA, the expected DNA methylation 
proportion at normally unmethylated loci should equal 
the fraction of M.SM-treated sperm in the mixture 



(0.10, 0.35, 0.60 and 0.85). Histograms of percent DNA 
methylation for the 10 and 35% mixture samples showed 
clear bimodal distributions (plots not shown), allowing us 
to distinguish normally unmethylated loci from a subset of 
'constitutively methylated loci', loci that would not reflect 
the proportion of M.S.M'I-treated DNA in the mixture. 
From these bimodal distributions, we selected 0.65 as a 
cut-off. Probes yielding a median beta value >0.65 in the 
10% M.SM sample are omitted from all mixture samples 
as constitutively methylated. 

Acute myeloid leukaemia patient samples 

As part of The Cancer Genome Atlas (TCGA) project, 
1 (.ig of DNA from each of 192 acute myeloid leukaemia 
(AML) patient samples was assayed on the Illumina 
Infinium HM27 platform. To interrogate a greater 
variety of sites, DNA from the same patient samples 
was again assayed on the Illumina Infinium HM450 
platform. A total of 25 978 (94%) of the target sequences 
on the HM27 array are retained on the HM450 array. As 
AML is a notoriously diverse disease with significant 
epigenetic heterogeneity (14), the data allowed us to 
compare the ability of the various background-correction 
methods to minimize differences between platforms, 
without diminishing biological differences between 
patients. A one-way ANOVA was conducted to partition 
variance between subjects from variance between each 
subject's two data sets (one HumanMethylation27, one 
HumanMethylation450) and identify methods to best 
improve the sensitivity of the Infinium platform for 
measuring biologic (between-subject) variation. The 
precision of the different background-correction 
methods was assessed by a comparison of the distribution 
of ANOVA F-statistics. Since we know SNPs at the CpG 
site can induce a DNA methylation signal across samples 
(15), we excluded probes with SNPs overlapping a CpG 
site (« = 2238, dbSNP build 135). All 384 samples for the 
AML patients are available as part of TCGA. 

HapMap sample data 

Jordana Bell and colleagues kindly provided 160 samples 
of raw .IDAT files of HM27 data for 77 HapMap 
lymphoblastoid cell lines (16), processed at the 
University of California, Los Angeles. Seventy-two 
samples were run in duplicate, four in triplicate, and one 
in quadruplicate. We processed these data with the 
methods presented above and performed a one-way 
analysis of variance to compare the sensitivity of the 
different approaches on an independent data set. Again, 
probes with SNPs occurring at the CpG site were omitted. 
The .IDAT files are now available from the Gene 
Expression Omnibus as GSE26133. 

RESULTS 

Distribution of non-specific fluorescence 

Figure 1 shows the distribution of intensities for both the 
negative control probes and the Infinium I probes in the 
colour channel opposite their designed extension base 
('oob') for four PBL replicates, two on the HM27 array 
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Figure 1. Distribution of negative control and out-of-band fluorescent probes for two PBL replicate samples on the HM27 and HM450 arrays. 



(left-hand side) and two on the HM450 array (right-hand 
side). The two samples on each platform are selected to 
show the variation in background intensity that can arise 
between replicates of the exact same sample. Note that the 
number of Infmium I probes that provide information on 
out-of-band intensities is orders of magnitude larger than 
the number of control probes. This motivates our use of 
the opposite-channel fluorescence, rather than negative 
control probes, for correcting background effects. The 
mean of the background tends to be equally well 
captured by either probe population, but variances from 
two distributions are considerably different, and therefore 
warrant individual consideration. The Gamma model for 
non-specific fluorescence directly accommodates the skew 
in distributions of 'oob' probes, while the nonparametric 
'doob' procedure makes no distributional assumptions 
at all. 

Low-level signal processing 

The substantial technical variation in raw signals from 
Illumina Infinium HumanMethylation450 microarrays, 
and the improvement yielded by low-level processing, is 
shown in Figure 2. For six replicate samples of PBLs, 
densities of the methylated allele intensities, unmethylated 
allele intensities and Beta values are shown (top row to 
bottom row), before and after processing (left and right 
columns, respectively). Processing included both 'noob' 
background correction (Normal-exponential convolution 
using out-of-band probes) and dye-bias equalization. A 
contributing factor to the differences between profiles 
reflects the differences in background distribution 
between identical samples run on different plates. 
Background fluorescence has the effect of shrinking the 
dynamic range of Beta values, and dye bias variations 
skew the values of Infinium II probes differentially 
across samples. By correcting for both, we spread 
the peaks closer to the extremes of the distribution. 



The median pairwise Spearman correlation between Beta 
(or M-) values of uncorrected replicates was 0.9798; for 
the corrected replicates, the median correlation was 
0.9897. 

The parametric convolution models estimate the signal 
intensity distribution from the pool of background- 
subtracted methylated and unmethylated allele intensities 
(by colour channel). For the PBL sample intensities shown 
in Figure 2, the shape parameter for a Gamma 
distribution, which is equal to one under the exponential 
model, is estimated to be less than one under the Gamma 
convolution. This reflects a signal distribution that is even 
more highly skewed, and with greater variance, than the 
exponential distribution assumed under a Normal 
exponential convolution. A quantile-quantile plot shows 
the observed and estimated distributions for the fore- 
ground and background intensities from a randomly 
selected PBL sample, analysed using a Normal- 
exponential, or Gamma-gamma model (see Supple- 
mentary Figure SI). The estimates from the Gamma 
convolution show a better fit to the observed intensities 
than the estimates from the Normal-exponential for the 
middle 95% of the distribution. A better fit of background 
intensity is also seen by modelling a Gamma distribution 
compared with a Normal. 

Probe-wise variance in replicate samples 

Figure 3 shows the probe-wise standard deviation of the 
Beta values, and M-values, for the different background- 
correction methods. We find that methods using only 
negative control probes to measure background can 
have the undesirable property of increasing the probe- 
wise standard deviation for the technical replicate PBL 
samples on the M-value scale. The median variance is 
minimized by the newer convolution methods using the 
out-of-band intensities to estimate background, with 
'goob' showing the clear advantage on the Beta-value 
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Figure 2. Density plots for six PBL replicate samples on the HM450 arrays before and after processing (left and right columns, respectively) 
(P = 482421 targeted CpGs). Processed data were corrected by the Normal exponential convolution using out-of-band Infinium I probes (noob) 
and dye-bias equalization. Row 1: Methylated allele intensities; row 2: Unmethylated allele intensities; row 3: Beta values. 



scale, while the 'noob' procedure performs best on the 
M-value scale. Linear modelling of the probe-level 
variation as a function of the number of overlaid CpG 
sites per probe revealed statistically significant 
(P<2.2x 10~ 16 ) decreases in probe-level variation with 
each additional overlaid CpG locus; this relationship 
was present in both Infinium I and Infinium II probe 
designs, albeit with larger magnitude in the former. 

Bias, variance and RMSE as measured from 
mixture samples 

We next quantified the probe-level bias and variance trade 
offs presented by the various background-correction 



methods using a panel of samples with titrated mixtures 
of DNA methylation. Specifically, we computed the bias 
as the difference between each sample's Beta values and its 
titrated concentration of M.tol-treated DNA, or its 
M-values and the logit-transformed concentrations. We 
find that the bias shows a strong dependency on the 
concentration of M.fel-DNA, with more pronounced 
differences towards the extremes of 10% (mostly 
unmethylated) and 85% (mostly methylated) (Figure 4 
and Supplementary Figure S2). For these concentrations, 
the bias is most reduced using the convolution models. We 
evaluated the probe-level variation in measurements using 
the sample standard deviation, whereupon a strong 
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Figure 3. Boxplot of the probe-specific standard deviation of (a) Beta values and (b) M-values by background-correction method: none (raw), 
background subtraction using negative control probes (q5), background subtraction using methylated allele intensities (lumi), Normal exponential 
convolution using negative control probes (normexp), distribution-free convolution using negative control probes (dfcm), distribution-free 
convolution using out-of-band Infinium I probes (doob), Normal exponential convolution using out-of-band Infinium I probes (noob), Gamma 
convolution using out-of-band Infinium I probes (goob) (;; = 6 PBL replicates on HM450, P - 482 421 targeted CpGs). Dye-bias equalization is also 
applied for all background-corrected data. 
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Figure 4. Boxplot of probe-specific bias in M-values by background-correction method: none (raw), background subtraction using negative control 
probes (q5), Normal exponential convolution using negative control probes (normexp), distribution-free convolution using negative control probes 
(dfcm), distribution-free convolution using out-of-band Infinium I probes (doob), Normal exponential convolution using out-of-band Infinium I 
probes (noob), Gamma convolution using out-of-band Infinium I probes (goob) [n = 4 replicates each of four mixtures samples, P = 255 594 CpG 
targets, filtering out features methylated in the 10% M.&.?I-treated fraction (average beta > 0.65)]. Dye-bias equalization is also applied for all 
background-corrected data, and probes are stratified by Infinium I and II design. 
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improvement was observed for all background-correction 
methods in the more numerous Infinium II probes, but 
Infinium I probes showed slight to moderate increases in 
probe-level variation (Figure 5 and Supplementary 
Figure S3). In evaluating the raw data, the Infinium II 
probes showed greater variability than Infinium I 
probes, but after background correction, this was no 
longer the case. In fact, for mixtures of 10-60%, the 
variability of the DNA methylation measures for 
Infinium II probes was lower than the variability for the 
Infinium I probes. 

We also tabulated the root mean squared error 
(RMSE), a function of both variance and bias, across 
the different mixing fractions. Near the extremes (where 
most measurements in biological samples are expected to 
fall), we find that the model-based convolutions using 
out-of-band probes, 'noob' and 'goob', generally 
perform best (Figure 6 and Supplementary Figure S4). 
We note, however, a slight increase in RMSE 
after background correction for probes with 35 or 60% 
DNA methylation, with 'noob' representing the smaller 
increase in RMSE between 'noob' and 'goob'. Thus, the 
models suffer mildly for probes that do not have one 
of their two alleles (M or U) near background. We 
omitted from these comparisons the method used by 
the lumi package, as it relies on the assumption that a 



large number of probes on the array are unmethylated, 
which is violated for these samples containing M.S.wI- 
treated DNA. 

Performance in replicated HapMap and TCGA samples 

In the HapMap cell line replicates (all Infinium I probes 
on HM27 BeadArrays), all background-correction 
methods appear to improve the ability of the platform 
to detect biological differences, as judged by the 
distribution of F statistics (Figure 7a). This illustrates 
that, regardless of the specific method used, the differences 
between cell lines are more prominent after processing, 
relative to the differences within replicates of the same 
cell line. A comparison of within-subject and between- 
subject standard deviations shows 'goob' as the method 
that reduces the within-subject variation the most 
(Supplementary Figure S5). The tendency of lumi's 
mode subtraction scheme to increase within-cell-line 
variance relative to the raw data is of concern. 

In the AML replicate comparison (Figure 7b), only the 
out-of-band convolution-based methods avoid decreasing 
sensitivity in the distribution of the F-statistics, with only 
'noob' providing a significant improvement in a cross- 
platform comparison. When comparing across platforms, 
21497 (91%) of the 23 740 targeted sequences appearing 
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Figure 5. Boxplot of probe-specific standard deviation in M-values by background correction method: none (raw), background subtraction using 
negative control probes (q5), Normal exponential convolution using negative control probes (normexp), distribution-free convolution using negative 
control probes (dfcm), distribution-free convolution using out-of-band Infinium I probes (doob), Normal exponential convolution using out-of-band 
Infinium I probes (noob), Gamma convolution using out-of-band Infinium I probes (goob) [n = 4 replicates each of four mixtures samples, 
P = 255 594 CpG targets, filtering out features methylated in the 10% M.&.sl-treated fraction (average beta > 0.65)]. Dye-bias equalization is also 
applied for all background-corrected data, and probes are stratified by Infinium I and II design. 
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Figure 6. Boxplot of probe-specific root mean squared error (RMSE) of M-values by background-correction method: none (raw), background 
subtraction using negative control probes (q5), Normal exponential convolution using negative control probes (normexp), distribution-free 
convolution using negative control probes (dfcm), distribution-free convolution using out-of-band Infinium I probes (doob), Normal exponential 
convolution using out-of-band Infinium I probes (noob), Gamma convolution using out-of-band Infinium I probes (goob) [n = 4 replicates each of 
four mixtures samples, P = 255 594 CpG targets, filtering out features methylated in the 10% M.&.sl-treated fraction (average beta > 0.65)]. Dye-bias 
equalization is also applied for all background-corrected data, and probes are stratified by Infinium I and II design. 
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Figure 7. (a) Boxplot of probe-specific ANOVA F-statistic of M-values for 160 HM27 arrays of 77 HapMap samples (72 duplicates, 4 triplicates and 
1 quadruplicate) by background-correction method: none (raw), background subtraction using negative control probes (q5), background subtraction 
using methylated allele intensities (lumi), Normal exponential convolution using negative control probes (normexp), distribution-free convolution 
using out-of-band Infinium I probes (doob), Normal exponential convolution using out-of-band Infinium I probes (noob), Gamma convolution using 
out-of-band Infinium I probes (goob). (P = 25913 CpG targets, filtering out features with SNPs at targeted CpGs). (b) Boxplot of probe-specific 
ANOVA F-statistic of M-values for 192 acute myeloid leukaemia samples from the Cancer Genome Atlas project, run on both HM27 and HM450 
arrays, arranged by background-correction method (left to right): none (raw), q5, lumi, normexp, doob, noob, goob. See part (a) for full name of 
method. (P = 23 740 features overlapping both platforms, SNPs omitted). There are 2243 (9%) features using chemistry I on both platforms, and 
there are 21497 (91%) features using chemistry I on HM27 and chemistry 2 on HM450, among those probes shared between the two platforms. 
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on both platforms are measured using the differently 
designed Infmium II probes on the HM450 platform. As 
the Infmium II probes have additional dye bias correction 
that does not affect Infmium I probes, we stratify the 
comparison of within- and between-subject variation by 
the design of the targeted sequence on the HM450 array. 
We observe that the out-of-band convolution methods 
decrease overall and within-subject variance in probes 
not switching chemistry between platforms (Infmium I 
probes) (Supplementary Figure S6). Among the probes 
that do switch chemistry, 'noob' shows greater between- 
subject variability than the unprocessed data, and shows 
the least variance inflation of the background-correction 
methods compared. The combination of these two events 
likely explains the increased ability of the assay to detect 
biological differences as compared with the raw data or 
other methods under study. 



DISCUSSION 

We present three novel deconvolution methods for 
correcting background fluorescence on Illumina Infmium 
DNA methylation arrays, all of which benefit from using 
the large number of out-of-band intensities provided by 
the platform's design. The out-of-band intensities of the 
Infmium I probe pairs provide an extremely large feature 
set for measuring the additive noise component. The 
distribution of the non-specific fluorescence of these 
features is markedly non-Gaussian in many samples; 
thus, we implement a flexible Gamma convolution 
model as a competitor to the well-regarded Normal- 
exponential convolution. Both the Normal exponential 
and Gamma exponential convolution models are special 
cases. We also evaluated a distribution-free convolution 
model, in part to assess the importance of model fit as a 
component of performance. Our results indicate that 
deconvolution methods operating on out-of-band 
probes, as a group, outperformed subtractive methods 
and methods using the designed negative controls on 
the array. In addition, the model-based deconvol- 
ution methods appeared to perform better than the 
distribution-free approach. However, no one single 
model emerged as the uniformly best performing method 
across all comparisons. 

Background correction has the greatest impact on 
signals that are near background, or correspondingly, 
DNA methylation proportions near 0 and 1. Indeed, the 
mixture samples showed that the greatest reduction in bias 
and RMSE from low-level processing appeared at the 
extremes of the Beta distribution, in mixtures of 10 and 
85% M.fc/-treated DNA, especially for Infmium II 
probes (which also benefitted from dye-bias equalization). 
The Gamma convolution showed the greatest reduction in 
RMSE for Beta values for these samples, with the Normal 
exponential a close second. At the same time, the 35 and 
60% mixture samples, corresponding to probes for which 
both the methylated and unmethylated intensities are 
above background, showed a slight inflation of variance 
from low-level processing. For these mixtures, the Normal 



exponential convolution showed less variance inflation 
than the Gamma convolution. 

The observation that some probes benefit from 
processing whereas others suffer inflated variance 
suggests that the distribution of the DNA methylation in 
our samples may impact the relative overall performance 
of different processing methods. This might explain the 
superior performance of the Gamma convolution for the 
HapMap lymphoblastoid cell lines that were processed on 
the HM27 arrays, with a preponderance of probes in 
promoter CpG islands measuring low levels of DNA 
methylation. For the TCGA AML replicate comparisons, 
the results differed. As the AML samples come from 
subjects harbouring significant subclonal populations 
(17), and the disease is associated with profound 
genome-wide changes to DNA methylation (18), we 
hypothesize that the observed shifts away from the 
extremes and towards intermediate levels of DNA 
methylation reduced the impact of background correction. 
Stark differences between the AML patients are recovered 
regardless of the processing method used. 

Our background correction and dye-bias equalization 
processing of the data tends to increase the dynamic 
range of the Beta values, most notably for Infmium II 
probes. This resembles the empirical spreading of the 
Infinium II probe distribution proposed by Dedeur- 
waerder et al. (19), though the spreading of peaks occurs 
in our method as a natural consequence of background 
correction, and affects both Infinium I and II designed 
probes. Further stratification and incorporation of probe 
sequence characteristics, both those of the out-of-band 
Infinium I controls and those of the analytic probes of 
either design, may yield greater improvement. 

The distribution of the tens of thousands of Infinium I 
self-controls for measuring background largely matches 
the designed negative controls, but provides many times 
the sample size for estimating distributional parameters. 
Additionally, the probe sequences for the paired probes 
are available, allowing a number of sequence-based 
refinements that are not possible with the negative 
control probes because their sequences are not made 
available. The presence of these probes, and their large 
number and known sequence properties, may be of use 
in supervised normalization procedures, in addition to 
the uses we demonstrate in unsupervised processing. 

Although supervised and unsupervised methods for 
normalization are of great interest for these arrays, we 
performed only the amount of processing necessary to 
make the two platforms (HM27 and HM450) comparable 
for the AML replicates. Better dye-bias correction and 
within-array normalization methods are of interest, but 
they are beyond the scope of this report. However, we 
did evaluate our noob and dye-bias equalization 
procedure in the presence of SWAN (20), a new 
software package that performs subset quantile nor- 
malization, and found that our method always improved 
the median pairwise Spearman correlation for all mixture 
samples and the PBL replicates. This leads us to believe 
that our methods will stand up to downstream processing. 
Another consideration for method improvement is the 
optimal offset value. For all signals, we used an offset of 
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15, the offset reported by Shi and colleagues (11) for gene 
expression data in their investigation of bias and 
sensitivity. However, as the addition of an offset affects 
Beta values and M-values differently, it could be that 
different values are preferred depending on the choice of 
scale. This remains to be explored. 

In summary, we find that the use of out-of-band 
intensities for estimating the parameters of a convolution 
model for background correction on Illumina Infinium 
methylation microarrays outperform subtractive appro- 
aches for background correction. However, as the corr- 
ection appeared to slightly increase variance of probes 
with Beta values in the interior (35 and 60% mixture 
samples), refinements are likely to see further improvements. 
Thus, the overall improvement of the method over 
competing approaches is primarily due to the distribution 
of beta values in experimental samples, falling primarily at 
the extremes of the scale. Data processing using the Normal 
exponential model is fast and computationally efficient; the 
Gamma convolution, in its present implementation ('goob'), 
requires ~ 10-1 5 min per HM450 sample on a 2Ghz Intel 
processor. Future work will investigate stratified and allele - 
specific designs, and normalization approaches leveraging 
the large population of out-of-band intensities. 

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online: 
Supplementary Figures 1-6. 
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